ABSTRACT

In the framework of a 4m class Solar Telescope we 
studied the performance of the MCAO using the 
LOST simulation package. In particular, in this 
work we focus on two different methods to reduce 
the time delay error which is particularly critical 
in solar adaptive optics: a) the optimization of the 
wavefront reconstruction by reordering the modal 
base on the basis of the Mutual Information and 
b) the possibility of forecasting the wavefront cor- 
rection through different approaches. We evaluate 
these techniques underlining pros and cons of their 
usage in different control conditions by analyzing 
the results of the simulations and make some pre- 
liminary tests on real data. 

Keywords: Active or adaptive optics; Wave-front 
sensing 

1. INTRODUCTION 

In the coming years the study of the Sun will be
supported by a new class of large solar telescopes
(ATST, EST) that will shed new light on the small
scale dynamics of the Sun. This class of telescopes
will need efficient Adaptive Optics systems to achieve
the corrected Field of View (~ 2') and the high spatial
resolution (~ 0.05") required to pursue the major
scientific targets.

In particular, the absence of point-like sources on
the solar surface slows down the estimation of the
wavefront and reduces the adaptive correction
efficiency. This is a very critical issue which has to
be analyzed in depth in the solar case.
In this work, we will focus on this point by showing 
a comparison between different approaches aimed 
to reduce the time lag in the MCAO correction
loop. 

Making use of both simulations and tests per- 
formed on real data sets, we study the behavior 
of different kinds of modal coefficients prediction 
tools based upon linear filters, neural networks and 
ARMA (Auto-Regressive Moving Average) pro- 
cesses. 

We also show how the wavefront modal decompo- 
sition can be successfully optimized using informa- 
tion theory to speed up calculations and reduce the 
amount of information to process. 

2. BASE OPTIMIZATION 

2.1 Mutual Information Reordering 

In the modal representation of the incoming wavefront
it is necessary both to minimize the reconstruction
error and to keep the number of modes in the
representation basis low, which saves computational
time. Therefore, the optimal basis is the one that can
reproduce the wavefront with the smallest number of
modes. The Karhunen-Loeve (K-L) functions are the most
popular choice, since they are ordered by covariance in
a Principal Component Analysis (PCA) approach, and in
this sense usually perform much better than Zernike
polynomials. We present an alternative tool to choose
the best ordering for a modal basis which makes use of
the concept of Mutual Information (MI). MI has been
used extensively in many applications involving
relevant information extraction or in constructing a
metric for optimizing recognition tasks. We have found
that MI also allows us to reduce the basis dimension
while maintaining the same error in the wavefront
reconstruction with respect to covariance ordering.
First, we introduce the MI concept: let us consider an
information channel as a connection between source and
receiver. The output $Y = \{y_0, y_1, y_2, \dots, y_{J-1}\}$
of this channel (selected from alphabet $\mathcal{Y}$) can
be thought of as a distorted version of the input message
$X = \{x_0, x_1, x_2, \dots, x_{K-1}\}$, selected from
alphabet $\mathcal{X}$.
The Shannon information entropy $H(X)$ is a measure of
the a priori uncertainty about X, or equivalently the
average information content per source symbol:

$$ H(X) = E[I(X)] = \sum_{k=0}^{K-1} p_k I(x_k) = \sum_{k=0}^{K-1} p_k \log(1/p_k) \tag{1} $$

where $p_k$ is the probability of occurrence of the
event $x_k$, $I(\cdot)$ is the information brought by the
message and $E[\cdot]$ is the expectation value.
We are interested in the measure of the uncer- 
tainty about the input X after observing the out- 
put Y, that is we are interested in the estimate of 
the amount of information transferred through the 
channel. 

To answer this question we can use the Mutual 
Information, defined as: 



$$ I(X;Y) = H(X) - H(X|Y) \tag{2} $$



where $H(X|Y)$ is the conditional information entropy
given by:

$$ H(X|Y) = \sum_{j=0}^{J-1} \sum_{k=0}^{K-1} p(x_k, y_j) \log\left[\frac{1}{p(x_k|y_j)}\right] \tag{3} $$



where $p(x_k, y_j)$ is the joint probability distribution
function, and $x_k$ and $y_j$ are the input and the
output respectively.
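
As an illustration of the definitions above, the following minimal Python sketch evaluates the entropy, the conditional entropy and the mutual information for a discrete channel described by a joint probability table. The function names and the two toy channels (`noiseless`, `independent`) are assumptions made for this example only; natural logarithms are used, so the values are in nats.

```python
import math

def entropy(p):
    """Shannon entropy H = sum_k p_k log(1/p_k), in nats."""
    return sum(pk * math.log(1.0 / pk) for pk in p if pk > 0.0)

def mutual_information(joint):
    """I(X;Y) = H(X) - H(X|Y), with joint[k][j] = p(x_k, y_j)."""
    p_x = [sum(row) for row in joint]          # marginal of the input
    p_y = [sum(col) for col in zip(*joint)]    # marginal of the output
    h_cond = 0.0
    for row in joint:
        for j, p_kj in enumerate(row):
            if p_kj > 0.0:
                # log(1 / p(x_k|y_j)) with p(x_k|y_j) = p(x_k, y_j) / p(y_j)
                h_cond += p_kj * math.log(p_y[j] / p_kj)
    return entropy(p_x) - h_cond

# Hypothetical two-symbol channels used only to exercise the functions.
noiseless = [[0.5, 0.0], [0.0, 0.5]]        # output identical to input
independent = [[0.25, 0.25], [0.25, 0.25]]  # output carries no information

mi_noiseless = mutual_information(noiseless)      # equals H(X) = log 2
mi_independent = mutual_information(independent)  # equals 0
```

The two limiting cases bracket the behavior described in the text: a noiseless channel transfers the full source entropy, while an output independent of the input transfers none.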

MI quantifies the reduction in the amount of 
uncertainty of the message X, knowing the output 
Y. We can consider the phase reconstruction pro- 
cess as an information transfer from the unknown 
wavefront phase (source) to the polynomial basis. 



In this framework the polynomial basis is the 
alphabet Y through which we can describe our 
wavefront phase, the reconstructed phase is a 
noisy version of the true phase and MI quantifies 
the amount of information transferred to the 
reconstructed phase. 

We can therefore use MI to quantify the capability 
of every mode (Zernike polynomial or K-L func- 
tion) to add information on the reconstruction 
of the incoming wavefront. We are then able 
to reorder the modal basis according to the MI 
added by each mode so that the modes which 
carry the most MI come first. 
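
The reordering idea can be sketched in Python as follows. The text does not specify how the MI of each mode is estimated, so this sketch makes its own assumptions: the MI is estimated with a simple equal-width histogram between the sampled contribution of each mode and the full reconstructed signal, and the data are synthetic Gaussian series with hypothetical amplitudes. The names `binned_mi` and `rank_modes_by_mi` are illustrative.

```python
import math
import random

def binned_mi(a, b, bins=8):
    """Histogram (plug-in) estimate of I(A;B) in nats from paired samples."""
    def digitize(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0   # guard against a constant series
        return [min(int((x - lo) / width), bins - 1) for x in v]

    ia, ib = digitize(a), digitize(b)
    n = len(a)
    joint, count_a, count_b = {}, {}, {}
    for i, j in zip(ia, ib):
        joint[(i, j)] = joint.get((i, j), 0) + 1
        count_a[i] = count_a.get(i, 0) + 1
        count_b[j] = count_b.get(j, 0) + 1
    # I = sum p(a,b) log[ p(a,b) / (p(a) p(b)) ], equivalent to H(A) - H(A|B)
    return sum((c / n) * math.log(c * n / (count_a[i] * count_b[j]))
               for (i, j), c in joint.items())

def rank_modes_by_mi(mode_series, reference):
    """Mode indices sorted by decreasing MI with the reference signal."""
    scores = [binned_mi(series, reference) for series in mode_series]
    order = sorted(range(len(scores)), key=lambda m: scores[m], reverse=True)
    return order, scores

# Synthetic data: three modes with hypothetical amplitudes (not VTT data).
random.seed(0)
amplitudes = [0.1, 5.0, 3.0]
mode_series = [[a * random.gauss(0.0, 1.0) for _ in range(4000)]
               for a in amplitudes]
reference = [sum(values) for values in zip(*mode_series)]

order, scores = rank_modes_by_mi(mode_series, reference)
```

In this toy case the mode with the largest amplitude dominates the reference signal and is ranked first. A histogram estimator is biased for short series, so the number of bins and samples would have to be tuned on real data.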

We have tested this method on Vacuum Tower Telescope
(VTT) closed loop wavefront sensor data. The dataset
analyzed in this work consists of a time series of
Zernike coefficients acquired on October 13th 2008
with the KAOS wavefront sensor. This wavefront sensor
is characterized by 36 subapertures and allows us to
perform a wavefront reconstruction with up to 27
Zernike modes.
We benchmarked the MI reordering against the standard
ordering via a fitting error analysis, i.e. we
estimated the residual errors of the phase
reconstructed with an increasing number of modes.
The residual error is defined as:



$$ \Delta\Phi(n) = |\Phi(n) - \Phi_{reference}| \tag{4} $$



where $\Phi(n)$ is the wavefront reconstructed using
the first n modes ($1 \le n \le 27$) of the chosen basis
and $\Phi_{reference} = \Phi(27)$ is the best
representation of the real wavefront phase we have, i.e.
the reconstruction with the full set of 27 Zernikes.
Using this descriptor we tested three bases: the standard
Zernike basis, a K-L basis reordered by MI and a Zernike
basis reordered by MI. The fitting error plots for a
typical wavefront representation from the dataset are
presented in Figure 1 as an example of the results. To
ease the interpretation of the plot, we recall that the
steeper the fitting error function, the better the modal
basis performs, since it allows us to use fewer modes
while maintaining the same error in the wavefront
representation. In this example, we can see that a
fitting error of ~ 0.4 waves$^2$ can be obtained by
reconstructing the wavefront with 20 Zernike polynomials
(standard order), with 15 K-L functions (MI order) or
with 7 Zernike polynomials (MI order).
MI reordering increases the performance for both Zernike
and Karhunen-Loeve bases. In particular, the MI
reordered Zernike basis shows the steepest fitting error
function. This is somewhat unexpected, because it has
been widely shown in the literature that the standard
K-L basis performs better than the Zernike one. Our
interpretation is that K-L modes carry high spatial
frequency information, being defined by PCA as a linear
combination of all Zernike modes, but, in closed loop
situations, a deformable mirror reduces the phase
variance, flattening the distortion amplitudes and thus
suppressing the power at high spatial frequencies.
Therefore, in closed loop, the K-L representation may
not be as efficient as in open loop applications.

Figure 1. Wavefront fitting error vs number of modes used. Zernike polynomials standard ordering (dotted blue);
K-L functions MI ordering (dashed green); Zernike polynomials MI ordering (continuous red). Results obtained
from a typical wavefront representation from the VTT dataset.
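
The fitting error analysis of Eq. (4) can be sketched in Python under a simplifying assumption that is not stated in the text: if the modes are orthonormal over the pupil, the norm of the residual equals the root sum of squares of the coefficients of the modes left out. The coefficient values are hypothetical, and sorting by coefficient magnitude is used here only as a stand-in for a reordering, not as the MI ordering itself.

```python
import math

def fitting_error_curve(coeffs, order):
    """Residual |Phi(n) - Phi_reference| for n = 1..N modes in the given order.

    Assumes an orthonormal basis, so the residual norm is the root sum of
    squares of the coefficients of the modes not yet included.
    """
    remaining = sum(c * c for c in coeffs)
    curve = []
    for m in order:
        remaining -= coeffs[m] ** 2
        curve.append(math.sqrt(max(remaining, 0.0)))  # clip rounding noise
    return curve

def modes_needed(curve, tolerance):
    """Smallest number of modes whose residual is at or below the tolerance."""
    for n, err in enumerate(curve, start=1):
        if err <= tolerance:
            return n
    return len(curve)

# Hypothetical coefficients (in waves) of a single reconstructed wavefront.
coeffs = [0.1, 0.9, 0.05, 0.6, 0.2, 0.02]
standard = list(range(len(coeffs)))
reordered = sorted(standard, key=lambda m: abs(coeffs[m]), reverse=True)

n_standard = modes_needed(fitting_error_curve(coeffs, standard), 0.25)
n_reordered = modes_needed(fitting_error_curve(coeffs, reordered), 0.25)
```

A steeper curve translates directly into a smaller number of modes for a given tolerance, which is the quantity compared in Figure 1.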

3. WAVEFRONT FORECASTING

We have already introduced the fact that the time delay
reduces Adaptive Optics system performance; it is
therefore extremely desirable to reduce the time delay
by trying to forecast the next wavefront correction. It
has been demonstrated that WFS measurements are
predictable and many open-loop demonstrations of the
efficiency of such predictive methods have been given.
In the following we make use of three different
approaches to try to forecast the time series of each
Zernike coefficient.

3.1 ARMA Approach

In this section we introduce the ARMA (Auto Regressive
Moving Average) predictive tools, which are a very
common instrument in stationary time series forecasting.
For a thorough treatment of this topic we refer the
reader to the cited literature; here we present only a
brief introduction.
Our aim is to find $m(X_n)$: a linear combination of the
past n values of the $X_t$ time series elements which is
the "best estimate" of $X_{n+h}$. We define as "best
estimate" the function which minimizes the mean square
error:

$$ E\,|X_{n+h} - m(X_n)|^2 \tag{5} $$

where $X_{n+h}$ is the true value at the time n + h.
We define ARMA(p,q) processes as follows. If

$$ X_n - \phi_1 X_{n-1} - \dots - \phi_p X_{n-p} = Z_n + \theta_{n1} Z_{n-1} + \dots + \theta_{nq} Z_{n-q} \tag{6} $$

where $Z_t$ is a zero-mean white noise time series and
$\phi_t$ and $\theta_{nt}$ represent the parameters of the
autoregressive and moving average parts respectively,
then it can be proved that the best prediction for
$X_{n+1}$ can be written as:

$$ m(X_{n+1}) = \begin{cases} \sum_{j=1}^{n} \theta_{nj} \left( X_{n+1-j} - m(X_{n+1-j}) \right) & \text{if } 1 \le n < k \\ \phi_1 X_n + \dots + \phi_p X_{n+1-p} + \sum_{j=1}^{q} \theta_{nj} \left( X_{n+1-j} - m(X_{n+1-j}) \right) & \text{if } n \ge k \end{cases} $$

This implies that for an ARMA(p,q) process, the
computation of $m(X_{n+1})$ requires the knowledge of
$k = \max[p,q]$ previous values. Since every process has
a distinctive AutoCovariance Function (ACF), this can be
used to characterize the process itself: we use the ACF
to estimate the parameters of the process, using the
moments algorithm for the moving average parameters and
the Yule-Walker algorithm for the autoregressive
parameters.
Our implementation of the ARMA tool can be summarized as
follows:

• Find the best analytical ACF which fits the observed
data ACF and estimate the ARMA parameters $\phi_t$ and
$\theta_{nt}$ from that ACF, as in the cited literature.

• Compute the forecast values as a linear combination of
the past measurements with weights given by the ARMA
parameters $\phi_t$ and $\theta_{nt}$.

The results of the application of this forecasting tool
are shown in Section 3.4.

3.2 Supervised Neural Network Approach

Another possible approach for forecasting is through
Neural Networks (NN). While with ARMA forecasting we
have to estimate the process by fitting its ACF, neural
networks are a more flexible paradigm which does not
require any assumption on the process itself. For an
introduction to NNs we refer to the cited literature.
Our NN forecasting procedure is implemented via a three
layer supervised neural network using the sliding window
technique, in which each measurement in the sliding
window is addressed to one input neuron. The network
dimension has been chosen by weighing learning speed,
accuracy and stability, in order to maximize performance
while keeping computational time as low as possible.
This resulted in a network with 10 input neurons, 10
hidden neurons and 10 learning patterns.

In the backpropagation learning process, k input terms
$X_{n-k}, \dots, X_n$ are fed to the network and the
output $X_{n+1}$ is calculated using weights which are
iteratively adjusted to minimize the output error since,
in the learning stage, we know the output exactly. We
train the network with 10 patterns, each composed of
k = 10 terms of the time series, to define the neural
weights; we then use the NN to compute the one-step
forward prediction $X_{n+1}$ of the signal $X_t$.
As the learning process is performed on the data itself,
it is necessary to recompute the network weights
whenever there is a significant change in the
characteristics of the time series, i.e. it is necessary
to take into account the non-stationarity of the
physical process.
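
The sliding window construction of the learning patterns can be sketched in Python as follows. Only the preparation of the training pairs is shown, not the network or its backpropagation training. The function name and the toy series are assumptions of this example, and taking the most recent samples as the training set is one possible choice that the text does not prescribe.

```python
def sliding_window_patterns(series, window=10, n_patterns=10):
    """Build (inputs, target) training pairs from the most recent samples.

    Each pattern feeds `window` consecutive measurements to the input
    neurons; the target is the measurement that immediately follows them.
    """
    needed = window + n_patterns
    if len(series) < needed:
        raise ValueError("series too short for the requested patterns")
    recent = series[-needed:]
    patterns = []
    for start in range(n_patterns):
        inputs = recent[start:start + window]
        target = recent[start + window]
        patterns.append((inputs, target))
    return patterns

# Toy series standing in for one modal coefficient (not real data).
series = [0.1 * t for t in range(30)]
patterns = sliding_window_patterns(series)
```

With the values quoted in the text (10 input neurons, 10 patterns), 20 consecutive measurements are enough to build the whole training set, which keeps a periodic retraining inexpensive in terms of data.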

The results of the application of this procedure are 
shown in 13. 



3.3 Linear Prediction Approach 

As we have already recalled, in AO systems in 
closed loop operation what normally happens is 
that the wavefront state is measured at time n and 
the correction is applied to the deformable mirror 
at time n + 1, when the wavefront has, in princi- 
ple, changed. The simplest approach, when using 
forecasting methods to counter this time delay, is 



to use the last measured values to find the best linear
estimate of the process; that is, to estimate the linear
slope from the last two measurements of the time series
$X_t$: $X_{n+1} = aX_n + bX_{n-1}$, with a and b as
weights of the linear relation. In the case of a fixed
time delay, a = 2 and b = -1. Such a trivial approach is
only valid for very short time scales, but if a high
loop stability is achieved in the AO correction, we do
not expect high frequency variations in the signal.
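
The weights a = 2 and b = -1 correspond to extending the last measured slope by one step: $X_{n+1} = X_n + (X_n - X_{n-1}) = 2X_n - X_{n-1}$. A minimal Python sketch, with an illustrative function name and toy input values, is:

```python
def linear_predict(series, a=2.0, b=-1.0):
    """One-step forecast X[n+1] = a*X[n] + b*X[n-1]."""
    if len(series) < 2:
        raise ValueError("need at least two measurements")
    return a * series[-1] + b * series[-2]

# Exact on a linear ramp: the next value of 1.0, 1.5, 2.0 is 2.5.
ramp_forecast = linear_predict([1.0, 1.5, 2.0])

# On a curved signal (0, 1, 4, next true value 9) the forecast is 7.
curved_forecast = linear_predict([0.0, 1.0, 4.0])
```

The second case illustrates why the method is only reliable on time scales over which the signal is close to linear.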

We introduced this method and present the results in
Section 3.4 mainly for the purpose of comparing the
performance of the different forecasting tools.

3.4 LOST simulations 

To compare the behavior of these three prediction tools,
we implemented them in the LOST (Layer Oriented
Simulation Tool) simulation code to study how they
affect the stability and the quality of the AO
correction. Given its modular design and its capability
of including the effects of Wave-Front Sensors (WFSs) on
measurements and of representing the phase delays
introduced by the atmospheric layers in terms of phase
screens, the LOST code is a most convenient choice to
analyze the performance of a MultiConjugate Adaptive
Optics (MCAO) system in a Multiple Field of View (MFoV)
approach. The concept of MFoV MCAO has been introduced
in earlier work. The idea is to use different WFSs, each
conjugated to a different turbulence layer as in a
layer-oriented approach, but with a different FoV for
each sensor. The WFS conjugated to the lowest turbulence
layer (the ground layer) has the widest FoV, while the
one conjugated to the highest layer has the narrowest.
Among other benefits, this approach fully exploits the
photon flux on the WFS and allows a more uniform
correction over the FoV.
The parameters of the LOST simulation we ran for the
test are summarized in Table 1. Most of these parameters
were chosen based on the main characteristics of current
designs for future large solar telescopes (such as ATST
and EST). The parameters which describe the seeing have
been set to reproduce a very simple atmosphere, which
nevertheless gives a good representation of the actual
efficiency of the simulated MCAO. In particular, in this
simulation, there is perfect matching between the
turbulent layers adding to the atmospheric distortions
in the wavefront and the deformable mirrors compensating
for them. It is also worth noting that integration time
and loop delay (which represent the amount of time
allotted for the wavefront measurement and for the
application of the computed correction by the deformable
mirror, respectively) together make up the total delay
for the correction of the wavefront. It is this total
delay that the forecasting methods aim to counter. The
prediction tools have been implemented in LOST as
filters on the correction signals, so that they can
compensate for the changes in the atmospheric turbulence
conditions during the system actuation. It is essential
to study their use in closed loop conditions because we
expect them to change the correction signal properties
dramatically, also from a statistical point of view.
Four LOST simulations were run with the same conditions,
one for each of the forecasting tools and one with a
standard MFoV MCAO correction. We present here a study
of both the evolution of the maximum Strehl Ratio (SR)
value in the FoV and the fraction of the FoV in which
the AO system achieves a high correction efficiency.
Table 1. Main parameters of the LOST simulation

Parameter                        Value
Telescope diameter               4 m
Azimuthal angle                  deg
Simulation time step             1 ms
Number of deformable mirrors     2
Integration time                 1 ms
Loop delay                       0.5 ms
Number of turbulence layers      2
Type of turbulence               Kolmogorov
D/r_0 (lower layer)              26.0
D/r_0 (upper layer)              2.0
Wind speed (lower layer)         7 m/s
Wind speed (upper layer)         70 m/s

Figure 2. Maximum SR vs time. Black continuous line:
simulation without forecasting; red dots: linear
predictor; magenta dashed line: neural network
predictor; blue dash-dotted line: ARMA predictor.

Figure 3. Power spectra of SR after the loop closure.
Black continuous line: simulation without forecasting;
red dots: linear predictor; magenta dashed line: neural
network predictor; blue dash-dotted line: ARMA
predictor.

From the standard MFoV MCAO simulation, we find that the
ARMA model which best fits the correction time series in
closed loop conditions is an ARMA(2,3) model. The
$\phi_t$ and $\theta_{nt}$ parameters were estimated from
the relevant ACF. The standard simulation has also been
used to train the NN: we used 10 patterns of 10 input
values each, taken after the AO loop closure
(time > 15 ms).
In Figure 2 we show the behavior of the SR peak as a
function of time for the four simulation runs.
After a few ms, the loop is firmly closed in all 
the four cases for the whole duration of the sim- 
ulation. The SR is strongly enhanced by all the 
prediction tools (SR > 0.6) with respect to the 
performance without wavefront short-time predic- 
tion (SR = 0.4), but the ARMA tool appears to 
perform best in terms of maximum SR achieved in 
closed loop conditions. 

To examine in more depth the effect of the prediction
tools on the loop stability, we studied the power
spectrum of the oscillations of the SR peak after the
loop closure (Figure 3). All three tools are able to
reduce the power of the oscillations for frequencies
above 70 Hz, showing a similar behavior in the power
spectra. At low frequencies (~ 50 Hz) only the ARMA and
the NN approaches can damp out the SR oscillation power,
while the linear prediction can only reduce the
amplitude with respect to the standard correction.

As mentioned above, one of the most important
requirements of the next generation solar telescopes is
a wide corrected FoV reaching 1' - 2'.
For this reason we also explored the effect of the
prediction tools in expanding the corrected FoV.
In Figure 4 we show the SR maps for the four different
simulation runs, averaged over the whole simulation run
after the loop closure. To help the reader, we
overplotted SR isocontours on the SR maps. As already
stated, all three methods reach a higher SR peak, and
the maps also show that the prediction greatly enlarges
the corrected FoV. Typically, SR ~ 0.3 is regarded as an
acceptable value for AO performance; therefore, we refer
to the area with SR > 0.3 as the FoV satisfactorily
corrected by the MCAO. We can estimate this area A with
an equivalent diameter: $d = \sqrt{A}$. This diameter is
extended from d ~ 0'.9 in the standard MCAO to d ~ 1'.25
by the linear predictor, to d ~ 1'.5 by the NN predictor
and to d ~ 1'.6 by the ARMA predictor.
From the results of these simulations, it is evident
that the prediction tools can enhance the loop stability
and achieve a better correction in terms of corrected
FoV and SR peak. In particular, the better performance
of the NN and ARMA forecasting methods reflects their
ability to describe the process underlying the wavefront
time series using a more sophisticated approach.

Figure 4. Average SR maps for the four simulation runs with SR isocontours overplotted. Upper left: standard
LOST simulation; upper right: simulation with linear predictor; lower left: simulation with ARMA predictor;
lower right: simulation with neural network predictor.

Figure 5. VTT data. Left: Zernike 1 coefficient versus time with the linear predictor prediction
superimposed; center: Zernike 1 coefficient versus time with the ARMA prediction superimposed; right: Zernike 1
coefficient versus time with the neural network prediction superimposed.

4. CONCLUSIONS AND FUTURE 
DEVELOPMENTS 

In this work we have explored three methods to reduce
the time delay in MCAO wavefront correction. This time
lag particularly affects the performance of those
systems that cannot rely on high contrast or PSF-like
sources and have to work with a poorly contrasted target
such as, for example, solar granulation. In these
systems wavefront sensing requires the application of
real-time correlation techniques which, in addition to
the larger integration time needed, increase the time
delay between measurement and correction of the
wavefront.
For these reasons, for the next generation so- 
lar telescopes like EST and ATST, methods are 
needed to reduce the time lag in order to achieve 
the desired MCAO performance. We have shown, 
using MCAO simulations, that the use of predic- 
tion tools can dramatically enhance the MCAO 
performance in terms of corrected FoV, SR peak 
and loop stability. 

A natural development of this approach would 
be the implementation of the prediction schemes 
by using dedicated hardware. Another important 
step is of course the realization of an optical bench 
demonstrator to apply such methods in real con- 
ditions. 

A first step toward this goal has been a preliminary
test of the prediction tools on a real Zernike
coefficient time series acquired with the KAOS wavefront
sensor at the VTT Solar Telescope. We present here an
example of the performance of the three different
prediction approaches in forecasting the same time
series (Figure 5). Here, we cannot apply the forecast to
the correction; therefore we have to verify whether the
correction predicted on the basis of the measurements at
or before time-step n is a good representation of the
measured Zernike coefficient value at time-step n + 1.
The first plot refers to the linear predictor method,
which shows a fairly accurate prediction of the signal:
the correction forecast for time-step n + 1 is usually
close to the signal at time-step n + 1.

The second plot refers to the ARMA forecasting. We have
analyzed the Zernike coefficient time series and fitted
its ACF in a time window of 3500 ms, namely from time
step to 7000. The model which best represented the time
series was again an ARMA(2,3) model and it was used for
the $\phi_t$ and $\theta_{nt}$ parameter estimation. In
the plot we show how, using only time-steps 6998 - 7000,
ARMA is able to forecast the values from 7001 to 7010,
with the associated error. The approach shows a good
prediction accuracy, at least on short time scales
(2 - 3 ms), with an increasing error at longer time
steps.
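
The fit-then-forecast procedure used for the ARMA test can be sketched in Python for the autoregressive part only. This is a simplified illustration, not the implementation used in this work: the moving average part (moments algorithm) is omitted, the Yule-Walker equations are solved with the Levinson-Durbin recursion, and the input is a synthetic AR(1) series with a hypothetical coefficient of 0.8 rather than VTT data. All function names are illustrative.

```python
import random

def autocovariance(x, max_lag):
    """Sample autocovariance of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]

def yule_walker(acov, p):
    """AR(p) coefficients from the autocovariance (Levinson-Durbin)."""
    phi = []
    err = acov[0]
    for k in range(1, p + 1):
        acc = acov[k] - sum(phi[j] * acov[k - 1 - j] for j in range(k - 1))
        refl = acc / err
        phi = [phi[j] - refl * phi[k - 2 - j] for j in range(k - 1)] + [refl]
        err *= 1.0 - refl * refl
    return phi

def forecast(x, phi, steps):
    """Iterated forecast: each prediction is fed back as an input."""
    mean = sum(x) / len(x)
    history = [v - mean for v in x[-len(phi):]]
    out = []
    for _ in range(steps):
        nxt = sum(phi[j] * history[-1 - j] for j in range(len(phi)))
        out.append(nxt + mean)
        history.append(nxt)
    return out

# Synthetic AR(1) series standing in for a modal coefficient time series.
random.seed(1)
x = [0.0]
for _ in range(5000):
    x.append(0.8 * x[-1] + random.gauss(0.0, 1.0))

phi = yule_walker(autocovariance(x, 2), 2)   # fit an AR(2) model
prediction = forecast(x, phi, 10)            # ten steps ahead
```

Because each forecast is built on previous forecasts, the prediction error grows with the horizon, consistent with the behavior described above for longer time steps.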

In the rightmost plot we show the forecast obtained by
the NN described in Section 3.2. The learning time of
the NN ended at time-step 6980 and the triangles
represent the prediction of the Zernike coefficient
value $X_n$ computed using the previous 10 values.

From preliminary results, it appears that the NN 
weights and ARMA parameters are valid only on 
time scales shorter than a few tens of ms after the 
learning time. Consequently, readjusting the NN 
weights and the ARMA parameters (or the whole 
ARMA model) periodically to account for the non- 
stationarity effects of the time series seems to be 
unavoidable. Of course, in both cases, the ratio of 
the time spent between successive recomputations 
and the time required for the recomputation itself 
is the key. 

Since the prediction tools act on the time series of
coefficients used for the wavefront representation, it
can be very useful to provide an efficient signal
compression which can reduce the amount of information
to process, while maintaining at the same time a high
wavefront description accuracy. Using information
theory, we have also shown that, in the case of closed
loop operation, there exists a more efficient
representation than covariance ordered K-L functions.
In particular, using Mutual Information, we are able to
describe a turbulent wavefront with the same errors, but
with a lower basis dimension, by neglecting those terms
which contribute the least to the wavefront description.
The next step is therefore to study the possible 



synergy of the two approaches: compressing the 
basis dimensionality and at the same time applying 
forecasting techniques to achieve the lowest possi- 
ble delay in the application of the wavefront cor- 
rection. 